In re patent application of: ASHIDA, et al.

Serial No.: 09/994,951 Group No.: Unknown 

Filed: November 27, 2001 Examiner: Unknown 

For: METHOD AND SYSTEM FOR DATABASE MANAGEMENT FOR DATA 
MINING 



I, Ken I. Yoshida, Registration No. 37,009, certify that
this correspondence is being deposited with the U.S. 
Postal Service as First Class mail in an envelope 
addressed to the Assistant Commissioner for Patents, 
Washington, D.C. 20231. 

On [date illegible]




Assistant Commissioner for Patents 
Washington, D.C. 20231 



PRELIMINARY AMENDMENT



2 9 2002 

Technology Center 2100



Sir: 



Please make the following change prior to examination of the above-referenced application:

In the Specification: 



Please amend the following:



Page 7, line 31, between "other" and "the," please insert --than--.



Page 9, line 3, between "the rule." and "A precision", please insert --200 people
satisfy the rule portion while 50 people satisfy both the rule and condition portions.--
upon the selected customer lists 107 and the speculation models 110, the speculation
processing unit 111 generates speculation results 112.



Still referring to FIGURE 1, each of the above processing units processes
information in a predetermined sequence and manner. According to a predetermined rule
such as in an if-then format, the characteristic rule generation processing unit 103 extracts
certain characteristic information to generate the characteristic rules 104 based upon the
customer data 101, which includes at least one record each of which contains at least
record entries. After the characteristic rules 104 are generated by the characteristic rule
generation processing unit 103, the segment selection unit 106 determines the structure of
the multi-dimensional database based upon the data definition information 102. The
condition items in the data definition information 102 correspond to the key dimensions in
the multi-dimensional database while the conclusion items correspond to the analysis
dimensions. After the dimensional structure is determined, the characteristic rule
generation processing unit 103 loads the customer data 101 and generates the
multi-dimensional database. In other words, the above segment selection process includes two
types of tasks. One task is to generate a multidimensional database using the condition items
as columns and rows, and the conclusion items as analysis results. The other task is to
output the selected customer list with the selected segment data into the above-created
multidimensional cells. A user then selects one of the condition items in the
characteristic rules 104. In response to the above user selection, a display screen is
generated to display cell values as the conclusion items in the columns and rows which
specify the condition items.

One example of the customer data 101 is illustrated in FIGURE 2. The
exemplary customer data 101 is generally organized by the month, including March, April
and May. Within each month, the first column is a customer number or ID to identify a
customer, and for each identified customer, a record includes information on
predetermined items such as gender, age, profit amount and cancellation status. Within
March, the cancellation status reflects an event between the beginning and the end of
March. On the other hand, information other than the cancellation status for the March
records is based upon the information at the end of January. For example, the customer



7 




HITACHI- 0018/340001335US1 



PATENT 



is between twenty and twenty-four and the gender is female, license is cancelled. A
rule/condition in the third column is a ratio between a number of records to satisfy the rule
and a number of records to satisfy only the condition portion of the rule. 200 people
satisfy the rule portion while 50 people satisfy both the rule and condition portions. A
precision level in the fourth column is a ratio between the number of records satisfying the
rule and the number of records satisfying the condition.
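
The precision level described above can be illustrated with a minimal sketch. It assumes that a record "satisfying the rule" satisfies both the condition portion and the conclusion portion of the if-then rule. The function name, the field names and the sample records are hypothetical and do not come from the specification.

```python
def rule_precision(records, condition, conclusion):
    """Return (condition_count, rule_count, precision) for an if-then rule.

    condition_count: records satisfying the condition portion.
    rule_count: records satisfying both condition and conclusion (assumption).
    """
    condition_count = 0
    rule_count = 0
    for record in records:
        if condition(record):
            condition_count += 1
            if conclusion(record):
                rule_count += 1
    precision = rule_count / condition_count if condition_count else 0.0
    return condition_count, rule_count, precision


# Hypothetical customer records; field names are illustrative only.
records = [
    {"age": 22, "gender": "F", "cancelled": True},
    {"age": 23, "gender": "F", "cancelled": False},
    {"age": 21, "gender": "F", "cancelled": True},
    {"age": 24, "gender": "F", "cancelled": False},
    {"age": 30, "gender": "F", "cancelled": True},
    {"age": 22, "gender": "M", "cancelled": True},
]


def is_young_female(record):
    # Condition portion: age between twenty and twenty-four and gender female.
    return 20 <= record["age"] <= 24 and record["gender"] == "F"


def is_cancelled(record):
    # Conclusion portion: the license is cancelled.
    return record["cancelled"]


print(rule_precision(records, is_young_female, is_cancelled))  # (4, 2, 0.5)
```

In this sample, four records satisfy the condition and two of them also satisfy the conclusion, so the precision is 0.5.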

Now referring to FIGURE 5, an exemplary multidimensional display is
illustrated. In this example, the above rule No. 1 is selected in FIGURE 4. The selected
rule is that if the age is between twenty and twenty-four and the gender is female, license is
cancelled. Based upon the above selected rule, a multidimensional display screen displays
condition items as well as conclusion items, and the multidimensional display includes
rows for displaying age groups and columns for displaying gender. In each cell, the above
ratio between the number of cancelled customers for the rule and a total number of
customers is displayed as a conclusion item. The above ratio value is automatically
calculated by the system according to the current invention. The cells that meet the
conditions used in the selected rule are in a certain predetermined color in order to
distinguish at a first glance from other conditions that are not used in the rule. Other
conditions are displayed as pages of the multidimensional database.
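
The cell values of such a display can be sketched as follows. This is only an illustration under stated assumptions: the ratio in each cell is taken to be the number of cancelled customers divided by the total number of customers in that cell, age groups are assumed to be five years wide, and all function names, field names and records are hypothetical.

```python
from collections import defaultdict


def age_group(age, width=5):
    """Map an age to a bucket label, e.g. 22 -> '20-24' (bucket width is an assumption)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def cancellation_cells(records):
    """Return {(age_group, gender): cancelled / total} for every populated cell."""
    totals = defaultdict(int)
    cancelled = defaultdict(int)
    for record in records:
        cell = (age_group(record["age"]), record["gender"])
        totals[cell] += 1
        if record["cancelled"]:
            cancelled[cell] += 1
    return {cell: cancelled[cell] / totals[cell] for cell in totals}


# Hypothetical records; the field names are illustrative only.
records = [
    {"age": 22, "gender": "F", "cancelled": True},
    {"age": 23, "gender": "F", "cancelled": False},
    {"age": 30, "gender": "F", "cancelled": True},
    {"age": 22, "gender": "M", "cancelled": False},
]

cells = cancellation_cells(records)
rule_cell = ("20-24", "F")  # cell matching the conditions of the selected rule
for cell in sorted(cells):
    marker = "*" if cell == rule_cell else " "  # stands in for the highlight color
    print(marker, cell, f"{cells[cell]:.2f}")
```

The asterisk plays the role of the predetermined color that distinguishes the cells used in the selected rule from the other cells.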



Still referring to FIGURE 5, the display is modifiable. A user compares the cell
values of particular interest under the selected conditions to other cell values in order to
determine the validity or significance of the selected rule. Furthermore, the user constructs
other displays or speculation models and selects a segment to be used for the speculation
models by observing cell value changes after adding and deleting the conditions. The
addition and deletion of the conditions are generally based upon the user's opinion and
experience or even upon trial and error. The conditions are changed by multi-dimensional
database functions such as drill up, drill down, slice and dice. In adding a
condition, one way is to drill down a page of the multi-dimensional database and to select a
slice. In deleting a condition, either a column or a row of a page in the multi-dimensional
database is drilled up. For example, the user moves a pointing device such as a mouse on a
triangle or an area indicating "ALL" in the profit amount and clicks the right mouse button
Within the function menu, the user selects a desired function by the left mouse button.
Assuming that the user selects the selected customer list generation in the function menu
and the March data is currently being displayed, the selected customer list 107 is selected
from the customer data 101 from May or two months after the current data and only from a
portion that satisfies the imposed conditions 108. The month for the above analysis is
automatically selected to be two months after the currently selected month. As described
above with respect to FIGURE 2, certain portions of the data other than a specified data
such as the cancellation status are automatically taken from two months earlier. Next,
assuming that the user selects the speculation model generation in the function menu, the
speculation model generation unit 109 automatically generates an optimal speculation
model based upon the conditions that the user has selected for the above described segment
selection process or unit 106. Lastly, assuming that the user selects the speculation in the
function menu, the speculation processing unit 111 automatically concludes the
speculation results 112 based upon the selected customer list 107 and the speculation
models 110. The speculation algorithm is substantially the same as the algorithm used for
speculating the potential cancelled customers or possibility for the cancelled customers.
The speculation algorithms include the prior art techniques that have been disclosed in the
background section of the current application. The speculation item in the function menu
remains disabled until the selected customer list 107 and the speculation models 110 have
been selected and successfully completed.
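
The two-month offset used when the selected customer list is generated can be sketched with simple month arithmetic. The function name and the (year, month) representation are assumptions made for illustration; the specification does not state how months are stored.

```python
def months_later(year, month, offset=2):
    """Return the (year, month) that lies `offset` months after the given month.

    A negative offset moves backwards, e.g. to the month from which the
    non-cancellation items of a record are taken.
    """
    index = year * 12 + (month - 1) + offset
    return index // 12, index % 12 + 1


# With March displayed, the selected customer list is drawn from May.
print(months_later(2001, 3))      # (2001, 5)
# The March records carry non-cancellation data from two months earlier.
print(months_later(2001, 3, -2))  # (2001, 1)
```

Counting months from a single integer index keeps the year rollover correct, for example from December to the following February.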

Now referring to FIGURE 7, a flow chart illustrates steps involved in a preferred
process of the speculation model generation/selection according to the current invention.
The steps are described with respect to the units and the data as shown in FIGURE 1. In a
step 701, a portion of the customer data 101 is selected according to the data definition
information 102. In the step 701, the selected portion is further refined to extract records
that satisfy the conditions as set forth in the selected segments 108. In a step 702, the
extracted records in the step 701 are divided into model candidate data and validating data.
For example, the division is accomplished by randomly sampling sixty percent of the
records as the model candidate data while the remaining forty percent serves as the validation
data. After the division in the step 702, the conditions as defined in the data definition
information 102 are comprehensively combined to generate in combination with the
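
The division in the step 702 can be sketched as a random sixty/forty split. This is a minimal illustration only: the function name, the seed parameter and the use of shuffling to perform the random sampling are assumptions, not details taken from the specification.

```python
import random


def split_records(records, candidate_fraction=0.6, seed=None):
    """Randomly divide records into (model candidate data, validation data)."""
    rng = random.Random(seed)  # a seed makes the split repeatable
    shuffled = list(records)   # copy so the caller's data is left untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * candidate_fraction)
    return shuffled[:cut], shuffled[cut:]


# Ten hypothetical record identifiers stand in for the extracted records.
candidates, validation = split_records(range(10), seed=0)
print(len(candidates), len(validation))  # 6 4
```

Every record lands in exactly one of the two groups, so the validation data is never used to build the model candidates.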
Page [illegible], line 4, between "108." and "Next," please insert --The month for the
above analysis is automatically selected to be two months after the currently selected
month. As described above with respect to FIGURE 2, certain portions of the data other
than a specified data such as the cancellation status are automatically taken from two
months earlier.--

Page [illegible], line 9, please change "conclude" to --"concludes".--



KNOBLE & YOSHIDA LLC 
Eight Penn Center, Suite 1350 
1628 John F. Kennedy Blvd. 
Philadelphia, PA 19103 
(215) 599-0600 



Respectfully submitted,





